Probabilistic and exact frequent subtree mining in graphs beyond forests

Similar resources

Probabilistic Frequent Subtree Kernels

Graph kernels have become a well-established approach in graph mining. One of the early graph kernels, the frequent subgraph kernel, is based on embedding the graphs into a feature space spanned by the set of all frequent connected subgraphs in the input graph database. A drawback of this graph kernel is that the preprocessing step of generating all frequent connected subgraphs is computational...

Frequent Subtree Mining - An Overview

Mining frequent subtrees from databases of labeled trees is a new research field that has many practical applications in areas such as computer networks, Web mining, bioinformatics, XML document mining, etc. These applications share a requirement for the more expressive power of labeled trees to capture the complex relations among data entities. Although frequent subtree mining is a more diffic...

Min-Hashing for Probabilistic Frequent Subtree Feature Spaces

We propose a fast algorithm for approximating graph similarities. Here, the similarity between two graphs is defined by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. While being an adequate choice for many similarity based learning tasks, this approach suffers from severe computational limitations. In...

Quantitative analysis of treebanks using frequent subtree mining methods

The first task of statistical computational linguistics, or any other type of data-driven processing of language, is the extraction of counts and distributions of phenomena. This is much more difficult for the type of complex structured data found in treebanks and in corpora with sophisticated annotation than for tokenized texts. Recent developments in data mining, particularly in the extraction...

Efficiently Methods for Embedded Frequent Subtree Mining on Biological Data

As a technology based on database, statistics and AI, data mining provides biological research a useful information analyzing tool. The key factors which influence the performance of biological data mining approaches are the large-scale of biological data and the high similarities among patterns mined. In this paper, we present an efficient algorithm named IRTM for mining frequent subtrees embe...

Journal

Journal title: Machine Learning

Year: 2019

ISSN: 0885-6125,1573-0565

DOI: 10.1007/s10994-019-05779-1